Search CORE

69 research outputs found

Embedding cube-connected cycles graphs into faulty hypercubes

Author: Bruck Jehoshua
Cypher Robert
Soroker Danny
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/1994
Field of study

We consider the problem of embedding a cube-connected cycles graph (CCC) into a hypercube with edge faults. Our main result is an algorithm that, given a list of faulty edges, computes an embedding of the CCC that spans all of the nodes and avoids all of the faulty edges. The algorithm has optimal running time and tolerates the maximum number of faults (in a worst-case setting). Because ascend-descend algorithms can be implemented efficiently on a CCC, this embedding enables the implementation of ascend-descend algorithms, such as bitonic sort, on hypercubes with edge faults. We also present a number of related results, including an algorithm for embedding a CCC into a hypercube with edge and node faults and an algorithm for embedding a spanning torus into a hypercube with edge faults

Caltech Authors

Fault-tolerant meshes and hypercubes with minimal numbers of spares

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/09/1993
Field of study

Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can seriously affect the performance of these machines. This paper presents several techniques for tolerating faults in d-dimensional mesh and hypercube architectures. Our approach consists of adding spare processors and communication links so that the resulting architecture will contain a fault-free mesh or hypercube in the presence of faults. We optimize the cost of the fault-tolerant architecture by adding exactly k spare processors (while tolerating up to k processor and/or link faults) and minimizing the maximum number of links per processor. For example, when the desired architecture is a d-dimensional mesh and k = 1, we present a fault-tolerant architecture that has the same maximum degree as the desired architecture (namely, 2d) and has only one spare processor. We also present efficient layouts for fault-tolerant two- and three-dimensional meshes, and show how multiplexers and buses can be used to reduce the degree of fault-tolerant architectures. Finally, we give constructions for fault-tolerant tori, eight-connected meshes, and hexagonal meshes

Caltech Authors

Fault-tolerant meshes with minimal numbers of spares

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/1991
Field of study

This paper presents several techniques for adding fault-tolerance to distributed memory parallel computers. More formally, given a target graph with n nodes, we create a fault-tolerant graph with n + k nodes such that given any set of k or fewer faulty nodes, the remaining graph is guaranteed to contain the target graph as a fault-free subgraph. As a result, any algorithm designed for the target graph will run with no slowdown in the presence of k or fewer node faults, regardless of their distribution. We present fault-tolerant graphs for target graphs which are 2-dimensional meshes, tori, eight-connected meshes and hexagonal meshes. In all cases our fault-tolerant graphs have smaller degree than any previously known graphs with the same properties

Caltech Authors

Wildcard dimensions, coding theory and fault-tolerant meshes and hypercubes

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995
Field of study

Hypercubes, meshes and tori are well known interconnection networks for parallel computers. The sets of edges in those graphs can be partitioned to dimensions. It is well known that the hypercube can be extended by adding a wildcard dimension resulting in a folded hypercube that has better fault-tolerant and communication capabilities. First we prove that the folded hypercube is optimal in the sense that only a single wildcard dimension can be added to the hypercube. We then investigate the idea of adding wildcard dimensions to d-dimensional meshes and tori. Using techniques from error correcting codes we construct d-dimensional meshes and tori with wildcard dimensions. Finally, we show how these constructions can be used to tolerate edge and node faults in mesh and torus networks

CiteSeerX

Caltech Authors

Tolerating faults in hypercubes using subcube partitioning

Author: Bruck Jehoshua
Cypher Robert
Soroker Danny
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/1992
Field of study

We examine the issue of running algorithms on a hypercube which has both node and edge faults, and we assume a worst case distribution of the faults. We prove that for any constant c, an n-dimensional hypercube (n-cube) with n^c faulty components contains a fault-free subgraph that can implement a large class of hypercube algorithms with only a constant factor slowdown. In addition, our approach yields practical implementations for small numbers of faults. For example, we show that any regular algorithm can be implemented on an n-cube that has at most n-1 faults with slowdowns of at most 2 for computation and at most 4 for communication. To the best of our knowledge this is the first result showing that an n-cube can tolerate more than O(n) arbitrarily placed faults with a constant factor slowdown

Caltech Authors

Tolerating faults in a mesh with a row of spare nodes

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/1992
Field of study

We present an efficient method for tolerating faults in a two-dimensional mesh architecture. Our approach is based on adding spare components (nodes) and extra links (edges) such that the resulting architecture can be reconfigured as a mesh in the presence of faults. We optimize the cost of the fault-tolerant mesh architecture by adding about one row of redundant nodes in addition to a set of k spare nodes (while tolerating up to k node faults) and minimizing the number of links per node. Our results are surprisingly efficient and seem to be practical for small values of k. The degree of the fault-tolerant architecture is k + 5 for odd k, and k + 6 for even k. Our results can be generalized to d-dimensional meshes such that the number of spare nodes is less than the length of the shortest axis plus k, and the degree of the fault-tolerant mesh is (d-1)k+d+3 when k is odd and (d-1)k+2d+2 when k is even

Caltech Authors

CCL: a portable and tunable collective communication library for scalable parallel computers

Author: Alex Ho
Ching-tien Ho
Jehoshua Bruck
Marc Snir
Pablo Elustondo
Robert Cypher
Senior Member
Senior Member
Shlomo Kipnis
Vasanth Bala
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1995
Field of study

A collective communication library for parallel computers includes frequently used operations such as broadcast, reduce, scatter, gather, concatenate, synchronize, and shift. Such a library provides users with a convenient programming interface, efficient communication operations, and the advantage of portability. A library of this nature, the Collective Communication Library (CCL), intended for the line of scalable parallel computer products by IBM, has been designed. CCL is part of the parallel application programming interface of the recently announced IBM 9076 Scalable POWERparallel System 1 (SP1). In this paper, we examine several issues related to the functionality, correctness, and performance of a portable collective communication library while focusing on three novel aspects in the design and implementation of CCL: 1) the introduction of process groups, 2) the definition of semantics that ensures correctness, and 3) the design of new and tunable algorithms based on a realistic point-to-point communication model

CiteSeerX

Caltech Authors

Lessons From the Pandemic: Engaging Wicked Problems With Transdisciplinary Deliberation

Author: Coleman Miles C
Cypher Joy M
Fleming Robert
Krummenacher Claude
Santos Susana C.
Publication venue: ScholarWorks at WMU
Publication date: 01/10/2021
Field of study

Some crises, such as those brought on or exposed by the COVID-19 pandemic, are wicked problems—large, complex problems with no immediate answer. As such, they make rich centerpieces for learning with respect to public deliberation and issue-based dialogue. This essay reflects on an experimental, transdisciplinary health and science communication course entitled Comprehending COVID-19. The course represents a collaborative effort among 14 faculty representing 10 different academic departments to create a resource for teaching students how to deliberate the pandemic, despite its attending, oversaturated, fake-news-infused, infodemic. We offer transdisciplinary deliberation as a pedagogical framework to expand communication repertoires in ways useful for sifting through the messiness of an infodemic while also developing key deliberation skills for productively engaging participatory decision-making with concern to wicked problems

Directory of Open Access Journals

ScholarWorks at WMU

Fault-Tolerant Meshes with Small Degree

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/1994
Field of study

This paper presents constructions for fault-tolerant two-dimensional mesh architectures. The constructions are designed to tolerate k faults while maintaining a healthy n by n mesh as a subgraph. They utilize several novel techniques for obtaining trade-offs between the number of spare nodes and the degree of the fault-tolerant network. We consider both worst-case and random fault distributions. In terms of worst-case faults, we give a construction that has constant degree and O(k to the power of 3) spare nodes. This is the first construction known in which the degree is constant and the number of spare nodes is independent of n. In terms of random faults, we present several new degree-6 and degree-8 constructions and show (both analytically and through simulations) that they can tolerate large numbers of randomly placed faults

Caltech Authors

Fault-tolerant de Bruijn and shuffle-exchange networks

Author: Bruck Jehoshua
Cypher Robert
Ho Ching-Tien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/05/1994
Field of study

This paper addresses the problem of creating a fault-tolerant interconnection network for a parallel computer. Three topologies, namely, the base-2 de Bruijn graph, the base-m de Bruijn graph, and the shuffle-exchange, are studied. For each topology an N+k node fault-tolerant graph is defined. These fault-tolerant graphs have the property that given any set of k node faults, the remaining N nodes contain the desired topology as a subgraph. All of the constructions given are the best known in terms of the degree of the fault-tolerant graph. We also investigate the use of buses to reduce the degrees of the fault-tolerant graphs still further